Automatically Acquiring Phrase Structure Using Distributional Analysis

Authors

  • Eric Brill
  • Mitchell Marcus
Abstract

In this paper, we present evidence that the acquisition of the phrase structure of a natural language is possible without supervision and with a very small initial grammar. We describe a language learner that extracts distributional information from a corpus annotated with parts of speech and is able to use this extracted information to accurately parse short sentences. The phrase structure learner is part of an ongoing project to determine just how much knowledge of language can be learned solely through distributional analysis.
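As a rough illustration of what "extracting distributional information from a corpus annotated with parts of speech" can mean, the sketch below counts the left and right neighbouring tags of each part-of-speech tag and compares two tags by how much of their context they share. This is not the authors' algorithm: the function names `tag_contexts` and `context_overlap`, the boundary markers `<s>` and `</s>`, the overlap measure, and the toy corpus are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def tag_contexts(tagged_sentences):
    """Count the (left tag, right tag) contexts in which each tag occurs."""
    contexts = defaultdict(Counter)
    for sentence in tagged_sentences:
        # Pad with hypothetical sentence-boundary markers.
        tags = ["<s>"] + [tag for _, tag in sentence] + ["</s>"]
        for i in range(1, len(tags) - 1):
            contexts[tags[i]][(tags[i - 1], tags[i + 1])] += 1
    return contexts


def context_overlap(contexts, a, b):
    """Shared context count divided by the smaller of the two tags' totals."""
    shared = sum(min(contexts[a][c], contexts[b][c]) for c in contexts[a])
    total = min(sum(contexts[a].values()), sum(contexts[b].values()))
    return shared / total if total else 0.0


# Toy corpus of (word, tag) pairs, invented for illustration only.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("dogs", "NOUN"), ("bark", "VERB")],
]
contexts = tag_contexts(corpus)
```

A learner in this spirit would treat tags, or tag sequences, with similar context distributions as candidates for the same phrase type; the measure the paper actually uses is not given in the abstract.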


Similar resources

Corpus-Oriented Grammar Development for Acquiring a Head-Driven Phrase Structure Grammar from the Penn Treebank

This paper describes a method of semi-automatically acquiring an English HPSG grammar from the Penn Treebank. First, heuristic rules are employed to annotate the treebank with partially-specified derivation trees. Lexical entries are automatically extracted from the annotated corpus by inversely applying schemata to partially-specified derivation trees.


Distributional phrase structure induction

Unsupervised grammar induction systems commonly judge potential constituents on the basis of their effects on the likelihood of the data. Linguistic justifications of constituency, on the other hand, rely on notions such as substitutability and varying external contexts. We describe two systems for distributional grammar induction which operate on such principles, using part-of-speech tags as t...


Probabilistic Distributional Semantics with Latent Variable Models

We describe a probabilistic framework for acquiring selectional preferences of linguistic predicates and for using the acquired representations to model the effects of context on word meaning. Our framework uses Bayesian latent-variable models inspired by, and extending, the well-known Latent Dirichlet Allocation (LDA) model of topical structure in documents; when applied to predicate–argument ...


Verb Phrase Ellipsis using Frobenius Algebras in Categorical Compositional Distributional Semantics

We sketch the basis of a categorical compositional distributional semantic approach to the analysis of verb phrase ellipsis.


Combining Syntactic Co-occurrences and Nearest Neighbours in Distributional Methods to Remedy Data Sparseness.

The task of automatically acquiring semantically related words has led people to study distributional similarity. The distributional hypothesis states that words that are similar share similar contexts. In this paper we present a technique that aims at improving the performance of a syntax-based distributional method by augmenting the original input of the system (syntactic co-occurrences) wit...



Publication date: 1992